Frequent Pattern Mining in Attributed Trees
Abstract
Frequent pattern mining is an important data mining task with a broad range of applications. Initially focused on the discovery of frequent itemsets, studies were later extended to mine structured forms such as sequences, trees, and graphs. In this paper, we introduce a new data mining method that mines a new kind of pattern in a collection of attributed trees (atrees). Attributed trees are trees in which each vertex is associated with an itemset. Mining this type of pattern (called asubtrees), which combines tree mining and itemset mining, requires the exploration of a huge search space. We present several new algorithms for attributed tree mining and show that their implementations can efficiently list frequent patterns in a database of several thousand attributed trees.
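To make the notion of an attributed tree concrete, the following minimal Python sketch models a vertex as an itemset plus a list of children and counts how many trees in a small database contain a given pattern. The abstract does not state the paper's exact inclusion relation, so the sketch assumes rooted ordered trees, parent-child (induced) matching, itemset containment at each matched vertex, and one count per tree. The names ANode, matches_at, contains, and support are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterator, List


@dataclass
class ANode:
    """A vertex of an attributed tree: an itemset plus ordered children."""
    items: FrozenSet[str]
    children: List["ANode"] = field(default_factory=list)


def matches_at(pattern: ANode, node: ANode) -> bool:
    """Assumed inclusion test with the pattern root mapped onto `node`.

    The pattern's itemset must be a subset of the node's itemset, and the
    pattern's children must match a left-to-right subsequence of the node's
    children (greedy leftmost matching is sufficient for subsequences).
    """
    if not pattern.items <= node.items:
        return False
    remaining = iter(node.children)
    # Each any() call consumes `remaining` up to and including the first
    # child that matches, so later pattern children only see later children.
    return all(
        any(matches_at(p_child, n_child) for n_child in remaining)
        for p_child in pattern.children
    )


def all_nodes(tree: ANode) -> Iterator[ANode]:
    """Yield every vertex of the tree in preorder."""
    yield tree
    for child in tree.children:
        yield from all_nodes(child)


def contains(tree: ANode, pattern: ANode) -> bool:
    """True if the pattern matches at some vertex of the tree."""
    return any(matches_at(pattern, node) for node in all_nodes(tree))


def support(database: List[ANode], pattern: ANode) -> int:
    """Number of trees in the database that contain the pattern."""
    return sum(contains(tree, pattern) for tree in database)


# Toy database of three attributed trees (illustrative data only).
t1 = ANode(frozenset("ab"), [ANode(frozenset("c")), ANode(frozenset("ad"))])
t2 = ANode(frozenset("a"), [ANode(frozenset("ab"), [ANode(frozenset("cd"))])])
t3 = ANode(frozenset("b"), [ANode(frozenset("d"))])
database = [t1, t2, t3]

# Pattern: a vertex containing item "a" with a child containing item "d".
pattern = ANode(frozenset("a"), [ANode(frozenset("d"))])
print(support(database, pattern))
```

Even in this toy setting, every vertex of a candidate pattern can carry any subset of the items, which hints at why combining tree mining with itemset mining enlarges the search space.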
Similar resources
Mining XML Frequent Query Patterns
With XML being the standard for data encoding and exchange over the Internet, efficiently finding interesting characteristics of XML queries has become a critical issue. Frequent query pattern mining is a technique for discovering the most frequently occurring query pattern trees in a large collection of XML queries. In this paper, we describe an efficient mining algorithm to discover the frequent que...
Canonical Forms for Labeled Trees and Their Applications in Frequent Subtree Mining
Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we first present two canonical forms for labeled rooted unordered trees–the breadth-first canonical form (BFCF) and the depth-first canonical form (DFCF). Then the canonical forms are applied to the frequent subtree mining problem. Based...
Indexing and Mining Free Trees
Tree structures are used extensively in domains such as computational biology, pattern recognition, computer networks, and so on. In this paper, we present an indexing technique for free trees and apply this indexing technique to the problem of mining frequent subtrees. We first define a novel representation, the canonical form, for rooted trees and extend the definition to free trees. We also ...
Mining of Users’ Access Behaviour for Frequent Sequential Pattern from Web Logs
Sequential Pattern mining is the process of applying data mining techniques to a sequential database for the purposes of discovering the correlation relationships that exist among an ordered list of events. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Discovering hidden information fro...
Mining Frequent Rooted Trees and Free Trees Using Canonical Forms
Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we present HybridTreeMiner, a computationally efficient algorithm that discovers all frequently occurring subtrees in a database of rooted unordered trees. The algorithm mines frequent subtrees by traversing an enumeration tree that syst...